STATISTICAL TECHNIQUES - II


TESTS OF SIGNIFICANCE 
Tests of significance enable us to decide, on the basis of the sample results, whether

• the deviation between the observed sample statistic and the hypothetical parameter value, or

• the deviation between two sample statistics

is significant or might be attributed to chance or the fluctuations of sampling.


Null hypothesis 


To apply a test of significance, we first set up a hypothesis, a definite statement
about the population parameter, called the null hypothesis and denoted by H0.


Alternative hypothesis 


Any hypothesis which is complementary to the null hypothesis (H0) is called an alternative
hypothesis, denoted by H1.


Example 


For example, if we want to test the null hypothesis that the population has a specified mean μ0,
then we have

H0: μ = μ0

The alternative hypothesis will be one of

• H1: μ ≠ μ0, i.e. μ > μ0 or μ < μ0 (two tailed alternative hypothesis).
• H1: μ > μ0 (right tailed alternative hypothesis, single tailed).
• H1: μ < μ0 (left tailed alternative hypothesis, single tailed).

Hence the alternative hypothesis tells us whether the test is a two tailed test or a one tailed
test.


Critical region (region of rejection) 


A region corresponding to a statistic t, in the sample space S, which amounts to rejection of the
null hypothesis H0 is called the critical region or region of rejection.


Acceptance region


A region corresponding to a statistic t, in the sample space S, which amounts to acceptance of
the null hypothesis H0 is called the acceptance region.


Level of significance 


The probability α that a random value of the statistic t belongs to the critical region is known as
the level of significance. Writing ω for the critical region,

$$P(t \in \omega \mid H_0) = \alpha$$


Errors in sampling 


• Type I Error: the error of rejecting the null hypothesis H0 when it is true. When a null
hypothesis is true, but the observed difference (of means) turns out significant and the hypothesis is
rejected, a Type I error is made. The probability of making a Type I error is
denoted by α, the level of significance. To control the Type I error, its
probability is fixed in advance at a chosen level of significance α. When H0 is true, the probability of
making a correct decision is then (1 - α).

• Type II Error: the error of accepting the null hypothesis H0 when it is false. In
other words, when a null hypothesis is false, but the difference of means is insignificant
and the hypothesis is accepted, a Type II error is made. The probability of making a Type
II error is denoted by β.


The following summary table, in which H0 denotes the tested hypothesis, may help fix the
concepts of the two kinds of error:

Decision based on data | Truth: H0 is true | Truth: H0 is false
Accept H0              | Correct decision  | Type II error (β)
Reject H0              | Type I error (α)  | Correct decision
[Figure: acceptance and rejection regions (of size α) for a right tailed test and a left tailed test.]
Standard Error


The standard deviation of the sampling distribution of a statistic is known as the standard error
(S.E.). It plays an important role in the theory of large samples and forms the basis of the
testing of hypotheses. If t is any statistic, then for large samples

$$z = \frac{t - E(t)}{\mathrm{S.E.}(t)}$$

is normally distributed with mean 0 and variance unity.


The critical values of z at different levels of significance (α), for both single tailed and two tailed
tests, are tabulated. The values at the 5% level are listed below.


Value of zα at 5% level of significance
For two tailed test = 1.96
For right tailed test = 1.645
For left tailed test = -1.645


Steps in testing of statistical hypothesis 


1. Null hypothesis. Set up H0 in clear terms.
2. Alternative hypothesis. Set up H1, so that we can decide whether to use a
one tailed test or a two tailed test.
3. Level of significance. Select the appropriate level of significance in advance,
depending on the reliability of the estimates.
4. Test statistic. Compute the test statistic $z = \frac{t - E(t)}{\mathrm{S.E.}(t)}$ under the null hypothesis.
5. Conclusion. Compare the computed value of z with the critical value zα at level of
significance α.
a. If |z| > zα, we reject H0 and conclude that there is a significant difference.
b. If |z| < zα, we accept H0 and conclude that there is no significant difference.
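
The steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `z_statistic` and `decide` and the numbers used are hypothetical and are not part of these notes.

```python
def z_statistic(t, expected, std_error):
    """Standardised statistic z = (t - E(t)) / S.E.(t)."""
    return (t - expected) / std_error


def decide(z, z_alpha=1.96):
    """Two tailed rule: reject H0 when |z| exceeds the critical value."""
    return "reject H0" if abs(z) > z_alpha else "accept H0"


# Hypothetical numbers: observed statistic 52, expected value 50 under H0,
# standard error 1.5.
z = z_statistic(52, 50, 1.5)
print(round(z, 3), decide(z))
```

For a one tailed test the same comparison would be made against 1.645 (5% level) without taking the absolute value.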


TESTING OF SIGNIFICANCE FOR LARGE SAMPLES 


If the sample size n > 30, the sample is taken as a large sample. For such samples we apply the
normal test, since the Binomial, Poisson, chi-square, etc. distributions are closely approximated by normal
distributions, assuming the population is normal.


Under large sample tests, the following are the important tests of significance:


• Testing of significance for single proportion.
• Testing of significance for difference of proportions.
• Testing of significance for single mean.
• Testing of significance for difference of means.
• Testing of significance for difference of standard deviations.


Testing of Significance for Single Proportion 


This test is used to find whether the difference between the proportion in the sample and that in the
population is significant. Let X be the number of successes in n independent trials with constant probability
P of success for each trial.


Under H0, the test statistic is

$$z = \frac{p - P}{\sqrt{PQ/n}}$$

where p = X/n is the proportion of successes in the sample and Q = 1 - P.


Example 


A coin was tossed 400 times and the head turned up 216 times. Test the hypothesis that the coin 
is unbiased. 


Solution 
H0: The coin is unbiased, i.e., P = 0.5.
H1: The coin is biased, i.e., P ≠ 0.5.
Here n = 400; X = number of successes = 216
p = proportion of successes in the sample = X/n = 216/400 = 0.54.
Population proportion P = 0.5
Q = 1 - P = 1 - 0.5 = 0.5

Under H0, the test statistic is

$$|z| = \frac{|p - P|}{\sqrt{PQ/n}} = \frac{|0.54 - 0.5|}{\sqrt{\frac{0.5 \times 0.5}{400}}} = 1.6$$

As |z| = 1.6 < 1.96, the value of zα at 5% level of significance for a two tailed test, H0 is accepted.

The coin may be regarded as unbiased.
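
The calculation can be checked with a short Python sketch. The helper name `single_proportion_z` is an assumption made for illustration; it simply evaluates z = (p - P)/sqrt(PQ/n).

```python
import math


def single_proportion_z(successes, n, P):
    """z statistic for a single proportion under H0: population proportion = P."""
    p = successes / n          # sample proportion
    Q = 1 - P
    return (p - P) / math.sqrt(P * Q / n)


# Coin example: 216 heads in 400 tosses, H0: P = 0.5
z = single_proportion_z(216, 400, 0.5)
print(round(z, 2))
```

The result is compared with 1.96 for a two tailed test at the 5% level.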


Problem 


A coin was tossed 400 times and the head turned up 220 times. Test the hypothesis that the coin
is unbiased.


Answer: |z| = 2.0. At the 5% level |z| > 1.96, so H0 is rejected and the coin appears biased; at the 1% level |z| < 2.58 and H0 would be accepted.


Example (JNTUK 2009, 2010, 2017, BPUT 2020 type)
In a sample of 1000 people in Karnataka, 540 are rice eaters and the rest are wheat eaters. Can
we assume that both rice and wheat are equally popular in this state at 1% level of
significance?
Solution
n = 1000
p = sample proportion of rice eaters = 540/1000 = 0.54
P = population proportion of rice eaters = 1/2 = 0.5
Q = 1 - P = 1 - 0.5 = 0.5
Null hypothesis H0: Both rice and wheat are equally popular in the state, i.e., P = 0.5.
Alternative hypothesis H1: P ≠ 0.5 (two tailed alternative)

Test statistic

$$z = \frac{p - P}{\sqrt{PQ/n}} = \frac{0.54 - 0.5}{\sqrt{\frac{0.5 \times 0.5}{1000}}} = 2.53$$

The calculated value of z = 2.53.

The tabulated value of z at 1% level of significance for a two tailed test is 2.58.

Since calculated z < tabulated z, we accept the null hypothesis H0 at 1% level of significance
and conclude that both rice and wheat are equally popular in the state.


Problem (JNTUK 2009, 2010, 2013) 


In a big city 325 men out of 600 men were found to be smokers. Does this information support
the conclusion that the majority of men in this city are smokers?


Answer: z > z0.05, i.e. 2.04 > 1.645 (right tailed). Reject H0.


Testing of Significance for Difference of Proportions 


Consider two samples X1 and X2 of sizes n1 and n2, respectively, taken from two different
populations. We test the significance of the difference between the sample proportions p1 and p2.
Under the null hypothesis H0 that there is no significant difference between the
two sample proportions, the test statistic is

$$z = \frac{p_1 - p_2}{\sqrt{PQ\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

where

$$P = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}, \qquad Q = 1 - P$$


Example 


A machine produced 16 defective articles in a batch of 500. After overhauling it produced 3 
defectives in a batch of 100. Has the machine improved? 


Solution 
p1 = 16/500 = 0.032; n1 = 500; p2 = 3/100 = 0.03; n2 = 100
Null hypothesis H0: The machine has not improved due to overhauling, i.e., p1 = p2.
H1: p1 > p2 (right tailed)

$$P = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{0.032 \times 500 + 0.03 \times 100}{500 + 100} \approx 0.032, \qquad Q = 1 - P = 0.968$$

Under H0, the test statistic is

$$z = \frac{p_1 - p_2}{\sqrt{PQ\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{0.032 - 0.03}{\sqrt{(0.032)(0.968)\left(\frac{1}{500} + \frac{1}{100}\right)}} = 0.104$$

Since the calculated value of |z| < 1.645, the significant value of z at 5% level of
significance (for a right tailed test), H0 is accepted, i.e., the machine has not improved due to
overhauling.
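
A minimal Python sketch of the same computation follows. The function name `two_proportion_z` is hypothetical, and the pooled proportion P is computed from the raw counts rather than from rounded values, so the last digits may differ slightly from a hand calculation.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2, using the pooled proportion P."""
    p1, p2 = x1 / n1, x2 / n2
    P = (x1 + x2) / (n1 + n2)   # equals (n1*p1 + n2*p2) / (n1 + n2)
    Q = 1 - P
    return (p1 - p2) / math.sqrt(P * Q * (1 / n1 + 1 / n2))


# Machine example: 16 defectives in 500 before, 3 in 100 after overhauling
z = two_proportion_z(16, 500, 3, 100)
print(round(z, 3))
```

For the right tailed alternative p1 > p2, H0 is rejected at the 5% level only if z exceeds 1.645.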


Example (UTU 2013, 5 marks) 


Before an increase in excise duty on tea, 800 people out of 1000 persons were found to be tea 
drinkers. After an increase in duty 800 persons were known to be tea drinkers in a sample of 
1200. Do you think that there has been significant decrease in the consumption of tea after the 
increase in excise duty? 


Solution 
Given n1 = 1000; n2 = 1200
p1 = proportion of tea drinkers before the increase in excise duty = 800/1000 = 0.8
p2 = proportion of tea drinkers after the increase in excise duty = 800/1200 = 0.67

$$P = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{1000 \times 0.8 + 1200 \times 0.67}{1000 + 1200} = 0.73$$

Q = 1 - P = 1 - 0.73 = 0.27
H0: p1 = p2 (i.e., there is no significant difference in the consumption of tea before and after the
increase in excise duty)
H1: p1 > p2 (i.e., right tailed test)
Under H0, the test statistic is

$$z = \frac{p_1 - p_2}{\sqrt{PQ\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{0.8 - 0.67}{\sqrt{(0.73)(0.27)\left(\frac{1}{1000} + \frac{1}{1200}\right)}} \approx 6.84$$

From the table, z0.05 = 1.645 (right tailed).

Since z > 1.645, H0 is rejected at 5% level of significance, i.e., there has been a
significant decrease in the consumption of tea after the increase in excise
duty.


Problem 


In a random sample of 100 men taken from village A, 60 were found to be consuming alcohol. 
In another sample of 200 men taken from village B, 100 were found to be consuming alcohol. 
Do the two villages differ significantly in respect to the proportion of men who consume 
alcohol? The table value of z at 5% level is 1.96 (two tailed test). 


Answer: z = 1.64 < 1.96, Ho will be accepted. 


Testing of Significance for Single Mean 


This tests whether the given sample of size n has been drawn from a population with mean μ, i.e.
whether the difference between the sample mean and the population mean is significant.
Under the null hypothesis, there is no difference between the sample mean and the population
mean.


The test statistic is

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

where σ is the standard deviation of the population.


Example 


A normal population has a mean of 6.8 and standard deviation of 1.5. A sample of 400 
members gave a mean of 6.75. Is the difference significant? 
Solution
H0: There is no significant difference between $\bar{x}$ and μ.
H1: There is a significant difference between $\bar{x}$ and μ.
Given μ = 6.8, σ = 1.5, $\bar{x}$ = 6.75 and n = 400

$$|z| = \frac{|\bar{x} - \mu|}{\sigma/\sqrt{n}} = \frac{|6.75 - 6.8|}{1.5/\sqrt{400}} = 0.67$$

As the calculated value of |z| < zα = 1.96 at 5% level of significance, H0 is accepted, i.e., there
is no significant difference between $\bar{x}$ and μ.
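
As a quick check, the single mean statistic can be evaluated in Python. The helper name `single_mean_z` is illustrative only and assumes the population standard deviation is known.

```python
import math


def single_mean_z(sample_mean, mu, sigma, n):
    """z = (sample mean - mu) / (sigma / sqrt(n)) for a large sample."""
    return (sample_mean - mu) / (sigma / math.sqrt(n))


# Example above: mu = 6.8, sigma = 1.5, sample mean 6.75, n = 400
z = single_mean_z(6.75, 6.8, 1.5, 400)
print(round(abs(z), 2))
```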


Example (JNTUH 2018, 10 marks) 


A sample of 400 items is taken from a normal population whose mean is 4 and variance 4. If
the sample mean is 4.45, can the sample be regarded as a simple sample?


Solution


Null hypothesis (H0): The sample can be regarded as having been drawn from the population
with mean 4.

Alternative hypothesis (H1): The sample cannot be regarded as having been drawn from the population with
mean 4.

Population mean μ = 4; sample mean $\bar{x}$ = 4.45
Variance σ² = 4, so σ = 2

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{4.45 - 4}{2/\sqrt{400}} = 4.5$$

As the calculated value of |z| > zα = 1.96 at 5% level of significance, H0 is rejected, i.e. the sample
cannot be regarded as having been drawn from the population with mean 4.


Problem 


A simple sample of 1,000 members is found to have a mean of 3.42 cm. Could it reasonably be
regarded as a simple sample from a large population whose mean is 3.30 cm and S.D. is 2.6 cm?
Given z0.05 = 1.96.


Answer: z < z0.05, i.e. 1.46 < 1.96. H0 is accepted. So the sample may be regarded as drawn from
the given population.


Example (JNTUH 2005, 2019, 5 marks) 


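Assuming that σ = 20.0, how large a random sample must be taken to assert with probability 0.95
that the sample mean will not differ from the true mean by more than 3.0 points?
Solution

Writing E for the maximum allowed difference |$\bar{x}$ - μ|,

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \quad\Rightarrow\quad n = \left(\frac{z\,\sigma}{E}\right)^2$$

Given z = 1.96 (for probability 0.95); σ = 20; E = 3.0.

Putting these values in the above formula,

$$n = \left(\frac{1.96 \times 20}{3}\right)^2 = 170.7$$

We get n = 171 (rounding up to the next whole number).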
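
The sample size follows from solving z = E/(σ/√n) for n and rounding up. A small Python sketch, in which the name `required_sample_size` and the symbol `max_error` (the allowed difference of 3.0 points) are assumptions made for illustration:

```python
import math


def required_sample_size(sigma, max_error, z=1.96):
    """Smallest n with z * sigma / sqrt(n) <= max_error."""
    return math.ceil((z * sigma / max_error) ** 2)


n = required_sample_size(sigma=20.0, max_error=3.0)
print(n)
```

Rounding up rather than to the nearest integer keeps the error bound satisfied.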


Example (JNTUH 2010, 2013, 2015, 2017, 2019, 5 marks)
A normal population has a mean of 0.1 and a standard deviation of 2.1. Find the probability that
the mean of a sample of size 900 will be negative.
Solution
Given μ = 0.1; σ = 2.1; n = 900

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{\bar{x} - 0.1}{2.1/\sqrt{900}} = \frac{\bar{x} - 0.1}{0.07}$$

so that $\bar{x} = 0.1 + 0.07z$. Now

$$P(\bar{x} < 0) = P(0.1 + 0.07z < 0) = P\left(z < -\frac{0.1}{0.07}\right) = P(z < -1.43)$$

= P(z > 1.43) = 0.5 - P(0 < z < 1.43)
= 0.5 - 0.4236 [From table]
= 0.0764
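
The table lookup can be cross-checked numerically. The sketch below uses the standard library function `math.erf` to evaluate the normal distribution function; the helper names are illustrative. Because it uses the unrounded value z = -0.1/0.07 instead of -1.43, the result agrees with the table value only to about three decimal places.

```python
import math


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def prob_sample_mean_below(x, mu, sigma, n):
    """P(sample mean < x) when the sample mean is N(mu, sigma^2 / n)."""
    return normal_cdf((x - mu) / (sigma / math.sqrt(n)))


p = prob_sample_mean_below(0.0, mu=0.1, sigma=2.1, n=900)
print(round(p, 3))
```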


Example (JNTUH 2005, 2019, 5 marks) 


In a random sample of 60 workers, the average time taken by them to get to work is 33.8
minutes with a standard deviation of 6.1 minutes. Can we reject the null hypothesis μ = 32.6
minutes in favour of the alternative hypothesis μ > 32.6 at the α = 0.025 level of significance?


Solution
Null hypothesis H0: μ = 32.6
Alternative hypothesis H1: μ > 32.6 (right tailed)

Level of significance: α = 0.025

The test statistic is (the sample standard deviation being used for σ, as the sample is large)

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{33.8 - 32.6}{6.1/\sqrt{60}} = 1.5238$$

The tabulated value of z at the 0.025 level of significance for a right tailed test is 1.96.

Hence, we see that z < z0.025, i.e. 1.5238 < 1.96.

The null hypothesis H0 will be accepted.


Test of Significance for Difference of Means of Two Large Samples 

Let $\bar{x}_1$ be the mean of a sample of size n1 from a population with mean μ1 and variance σ1². Let
$\bar{x}_2$ be the mean of an independent sample of size n2 from another population with mean μ2 and
variance σ2².


The test statistic is given by

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$


Example 


Intelligence tests of two groups of boys and girls gave the following results:


      | Mean | SD | Size
Girls | 75   | 8  | 60
Boys  | 73   | 10 | 100


Is the difference in mean scores significant? Also test for SD.

Solution

Null hypothesis H0: There is no significant difference between the mean scores, i.e., $\bar{x}_1 = \bar{x}_2$.
H1: $\bar{x}_1 \neq \bar{x}_2$

Under the null hypothesis

$$|z| = \frac{|\bar{x}_1 - \bar{x}_2|}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{|75 - 73|}{\sqrt{\frac{8^2}{60} + \frac{10^2}{100}}} = 1.3912$$

As the calculated value of |z| < 1.96, the significant value of z at 5% level of significance, H0
is accepted, i.e., there is no significant difference between the mean scores.
Test of Significance for the Difference of Standard Deviations of two large
samples


If s1 and s2 are the standard deviations of two independent samples, then under the null
hypothesis H0: σ1 = σ2, i.e., the sample standard deviations do not differ significantly, the
statistic is

$$z = \frac{s_1 - s_2}{\sqrt{\frac{\sigma_1^2}{2n_1} + \frac{\sigma_2^2}{2n_2}}}$$

where σ1 and σ2 are the population standard deviations. When these are unknown, then for large sample sizes they are replaced by the sample values:

$$z = \frac{s_1 - s_2}{\sqrt{\frac{s_1^2}{2n_1} + \frac{s_2^2}{2n_2}}}$$


Example 


Random samples drawn from two countries gave the following data relating to the heights of
adult males:


                     | Country A | Country B
Mean height (inches) | 67.42     | 67.25
Standard deviation   | 2.58      | 2.50
Sample size          | 1000      | 1200


(i) Is the difference between the means significant?

(ii) Is the difference between the standard deviations significant?


Solution
(i) Null hypothesis H0: μ1 = μ2, i.e., the sample means do not differ significantly.
Alternative hypothesis H1: μ1 ≠ μ2 (two tailed test)

$$z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{67.42 - 67.25}{\sqrt{\frac{2.58^2}{1000} + \frac{2.50^2}{1200}}} = 1.56$$

Since |z| < 1.96, we accept the null hypothesis at 5% level of significance.

(ii) H0: σ1 = σ2, i.e., the sample standard deviations do not differ significantly.

Alternative hypothesis H1: σ1 ≠ σ2 (two tailed)

$$z = \frac{s_1 - s_2}{\sqrt{\frac{s_1^2}{2n_1} + \frac{s_2^2}{2n_2}}} = \frac{2.58 - 2.50}{\sqrt{\frac{2.58^2}{2 \times 1000} + \frac{2.50^2}{2 \times 1200}}} \approx 1.04$$

Since |z| < 1.96, we accept the null hypothesis at 5% level of significance.
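
Both parts of this example can be reproduced with a short Python sketch. The function names `diff_means_z` and `diff_sd_z` are hypothetical; as is usual for large samples, the sample standard deviations are used in place of the unknown population values.

```python
import math


def diff_means_z(m1, s1, n1, m2, s2, n2):
    """z for the difference of two large-sample means."""
    return (m1 - m2) / math.sqrt(s1**2 / n1 + s2**2 / n2)


def diff_sd_z(s1, n1, s2, n2):
    """z for the difference of two large-sample standard deviations."""
    return (s1 - s2) / math.sqrt(s1**2 / (2 * n1) + s2**2 / (2 * n2))


# Country A: mean 67.42, SD 2.58, n = 1000; Country B: mean 67.25, SD 2.50, n = 1200
z_means = diff_means_z(67.42, 2.58, 1000, 67.25, 2.50, 1200)
z_sds = diff_sd_z(2.58, 1000, 2.50, 1200)
print(round(z_means, 2), round(z_sds, 2))
```

Both values are compared with 1.96 for a two tailed test at the 5% level.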


STUDENT T-DISTRIBUTION FOR SMALL SAMPLES (N < 30) 


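The t-distribution is used when the sample size is less than 30 and the population standard deviation is
unknown.


The t-statistic is defined as

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

where the standard deviation of the sample is

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$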


t-test of significance of the mean of a random sample 


H0: There is no significant difference between the sample mean $\bar{x}$ and the population mean μ.
Under H0 we use the statistic

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

where $\bar{x}$ is the mean of the sample and

$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

with (n - 1) degrees of freedom.


If the calculated value of t is such that |t| < tα, the null hypothesis is accepted. If |t| > tα, H0 is rejected.


Example 
The lifetime of electric bulbs for a random sample of 10 from a large consignment gave the
following data:

Item             | 1   | 2   | 3   | 4   | 5   | 6   | 7   | 8   | 9   | 10
Life in '000 hrs | 4.2 | 4.6 | 3.9 | 4.1 | 5.2 | 3.8 | 3.9 | 4.3 | 4.4 | 5.6


Can we accept the hypothesis that the average lifetime of a bulb is 4000 hrs? For df = 9, t0.05 =
2.26.
Solution


H0: There is no significant difference between the sample mean and the population mean, i.e., μ =
4000 hrs (μ = 4 in units of '000 hrs).

$$\bar{X} = \frac{\sum X}{n} = \frac{44}{10} = 4.4$$

X                 | 4.2  | 4.6  | 3.9  | 4.1  | 5.2  | 3.8  | 3.9  | 4.3  | 4.4 | 5.6
X - $\bar{X}$     | -0.2 | 0.2  | -0.5 | -0.3 | 0.8  | -0.6 | -0.5 | -0.1 | 0   | 1.2
(X - $\bar{X}$)²  | 0.04 | 0.04 | 0.25 | 0.09 | 0.64 | 0.36 | 0.25 | 0.01 | 0   | 1.44


From the table, $\sum (X - \bar{X})^2 = 3.12$

$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{3.12}{9}} = 0.589$$

$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{4.4 - 4}{0.589/\sqrt{10}} \approx 2.148$$

Degrees of freedom = n - 1 = 10 - 1 = 9
Given at df = 9, t0.05 = 2.26.

The calculated value of t is less than the table value t0.05.

∴ The hypothesis μ = 4000 hrs is accepted, i.e., the average lifetime of bulbs could be 4000 hrs.
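
The arithmetic of this example is easy to verify in Python. The sketch below is illustrative only; `one_sample_t` is a hypothetical helper that follows the definitions above, with the divisor n - 1 in the sample standard deviation.

```python
import math


def one_sample_t(data, mu):
    """Return (t, degrees of freedom) for H0: population mean = mu."""
    n = len(data)
    mean = sum(data) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    return (mean - mu) / (s / math.sqrt(n)), n - 1


# Bulb lifetimes in '000 hrs, tested against mu = 4 (i.e. 4000 hrs)
lifetimes = [4.2, 4.6, 3.9, 4.1, 5.2, 3.8, 3.9, 4.3, 4.4, 5.6]
t, df = one_sample_t(lifetimes, 4.0)
print(round(t, 3), df)
```

The computed |t| is then compared with the tabulated value for the returned degrees of freedom (2.26 for df = 9 at the 5% level).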


Example (GTU NM 2020, 5 marks) 


The heights of 9 males of a given locality are found to be 45, 47, 50, 52, 48, 47, 49, 53, 51
inches. Is it reasonable to believe that the average height differs significantly from the assumed
mean of 47.5 inches? (Given that for 5% level of significance t for 7 d.f. is 2.365, for 8 d.f. is
2.306 and for 9 d.f. is 2.262)


Solution
The null hypothesis H0: μ = 47.5

Alternative hypothesis H1: μ ≠ 47.5

$$\bar{X} = \frac{\sum X}{n} = \frac{442}{9} = 49.11$$

X                | 45    | 47    | 50   | 52   | 48    | 47    | 49    | 53    | 51   | Sum
X - $\bar{X}$    | -4.11 | -2.11 | 0.89 | 2.89 | -1.11 | -2.11 | -0.11 | 3.89  | 1.89 |
(X - $\bar{X}$)² | 16.89 | 4.45  | 0.79 | 8.35 | 1.23  | 4.45  | 0.01  | 15.13 | 3.57 | 54.89

$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{54.89}{8}} = 2.62$$

$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{49.11 - 47.5}{2.62/\sqrt{9}} \approx 1.84$$

Degrees of freedom = n - 1 = 9 - 1 = 8
Given at df = 8, t0.05 = 2.306.
The calculated value of t is less than the table value t0.05.

∴ The null hypothesis H0: μ = 47.5 will be accepted.


Example 


A sample of 20 items has mean 42 units and standard deviation 5 units. Test the hypothesis that 
it is a random sample from a normal population with mean 45 units. 


Solution 


H0: There is no significant difference between the sample mean and the population
mean, i.e., μ = 45 units.

H1: μ ≠ 45 (two tailed test)
Given: n = 20, $\bar{X}$ = 42, S = 5; df = 20 - 1 = 19

$$s^2 = \frac{n}{n - 1} S^2 = \frac{20}{20 - 1} (5)^2 = 26.31 \;\Rightarrow\; s = 5.129$$

$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}} = \frac{42 - 45}{5.129/\sqrt{20}} = -2.615; \quad |t| = 2.615$$

The tabulated value of t at 5% level for 19 d.f. is t0.05 = 2.09.

Since |t| > t0.05, the hypothesis H0 is rejected, i.e., there is a significant difference between the
sample mean and the population mean, i.e., the sample could not have come from this population.


t-test for difference of means of two small samples 


H0: The samples have been drawn from normal populations with equal means μ1 and μ2,
i.e., H0: μ1 = μ2.


Let $\bar{X}$, $\bar{Y}$ be the means of the two samples.


Under this H0 the test statistic t is given by

$$t = \frac{\bar{X} - \bar{Y}}{s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \qquad s^2 = \frac{\sum (X_i - \bar{X})^2 + \sum (Y_j - \bar{Y})^2}{n_1 + n_2 - 2}$$

and df = n1 + n2 - 2.
If the two samples' standard deviations s1, s2 are given, then we have

$$s^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}$$

Example 


Two samples of sodium vapor bulbs were tested for length of life and the following results were
obtained:


        | Size | Sample mean | Sample SD
Type I  | 8    | 1234 hrs    | 36 hrs
Type II | 7    | 1036 hrs    | 40 hrs


Is the difference in the means significant to generalize that Type I is superior to Type II
regarding length of life? t0.05 at df 13 is 1.77 (one tailed test).


Solution
H0: μ1 = μ2, i.e., the two types of bulbs have the same lifetime.

H1: μ1 > μ2, i.e., Type I is superior to Type II.

$$s^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2} = \frac{8(36)^2 + 7(40)^2}{8 + 7 - 2} = 1659.076 \;\Rightarrow\; s = 40.7317$$

$$t = \frac{\bar{X}_1 - \bar{X}_2}{s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{1234 - 1036}{40.7317\sqrt{\frac{1}{8} + \frac{1}{7}}} \approx 9.39$$

df = 8 + 7 - 2 = 13
Given t0.05 at df 13 is 1.77 (one tailed test).

Since calculated |t| > t0.05, H0 is rejected, i.e. H1 is accepted.

∴ Type I is definitely superior to Type II.
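
A minimal Python sketch of the pooled two-sample t statistic, assuming the sample standard deviations were computed with divisor n (which is why the pooled variance uses n1·s1² + n2·s2² in the numerator). The function name `pooled_t` is hypothetical. Note that the square root applies to the whole term 1/n1 + 1/n2.

```python
import math


def pooled_t(mean1, s1, n1, mean2, s2, n2):
    """Return (t, df) for H0: mu1 = mu2 with a pooled standard deviation."""
    df = n1 + n2 - 2
    s = math.sqrt((n1 * s1**2 + n2 * s2**2) / df)       # pooled SD
    t = (mean1 - mean2) / (s * math.sqrt(1 / n1 + 1 / n2))
    return t, df


# Sodium vapor bulbs: Type I (n = 8, mean 1234, SD 36), Type II (n = 7, mean 1036, SD 40)
t, df = pooled_t(1234, 36, 8, 1036, 40, 7)
print(round(t, 2), df)
```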


Example (DU 2003, JNTUH 2019, 5 marks) 


The mean life of a sample of 10 electric bulbs was found to be 1456 hours with S.D. of 423
hours. A second sample of 17 bulbs chosen from a different batch showed a mean life of 1280
hours with S.D. of 398 hours. Is there a significant difference between the means of the two
batches?


Solution 
Given that
n1 = 10; $\bar{x}_1$ = 1456; s1 = 423
n2 = 17; $\bar{x}_2$ = 1280; s2 = 398
Null hypothesis H0: μ1 = μ2, i.e. there is no significant difference between the means of the two
samples.

Alternative hypothesis H1: μ1 ≠ μ2 (two tailed test)

$$S = \sqrt{\frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{10 \times 423^2 + 17 \times 398^2}{10 + 17 - 2}} = 423.42$$

Standard error of the difference

$$\mathrm{S.E.}(\bar{x}_1 - \bar{x}_2) = S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = 423.42\sqrt{\frac{1}{10} + \frac{1}{17}} = 168.74$$

Test statistic

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\mathrm{S.E.}(\bar{x}_1 - \bar{x}_2)} = \frac{1456 - 1280}{168.74} = 1.04$$

Level of significance: take α = 0.05
The tabulated or critical value of t0.05 = 2.06 (for df = 10 + 17 - 2 = 25)
The calculated value of |t| = 1.04 < t0.05. Hence the null hypothesis H0 will be accepted.


CHI-SQUARE (χ²) TEST


When a coin is tossed 200 times, theoretical considerations lead us to expect 100 heads and
100 tails. But in practice, these results are rarely achieved. The quantity χ² (a Greek letter,
pronounced as chi-square) describes the magnitude of the discrepancy between theory and
observation.

If O_i (i = 1, 2, ..., n) is a set of observed (experimental) frequencies and E_i (i = 1, 2, ..., n) is
the corresponding set of expected (theoretical or hypothetical) frequencies, then χ² is defined
as

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$
Degree of freedom


The number of degrees of freedom is the total number of observations less the number of
independent constraints imposed on the observations. Degrees of freedom (d.f.) are
usually denoted by ν (the letter 'nu' of the Greek alphabet).


Example 


If we have to choose any four numbers whose sum is 50, we can exercise our independent 
choice for any three numbers only, the fourth being 50 minus the total of the three numbers 
selected. Thus, though we were to choose any four numbers, our choice was reduced to three 
because of one condition imposed. 


There was only one restraint on our freedom and our degrees of freedom were (n - 1) = 4- 1 = 
3. If two restrictions are imposed, our freedom to choose will be further curtailed and degrees 
of freedom will be 4-2 = 2. 


Example (PTU 2011, AKTU 2019, 3.5 marks)
A die is thrown 276 times and the results of these throws are given below:


Number appeared on the die | 1  | 2  | 3  | 4  | 5  | 6
Frequency                  | 40 | 32 | 29 | 59 | 57 | 59


Test whether the die is biased or not. Tabulated value of χ² at 5% level of significance for d.f. =
5 is 11.07.


Solution

Null hypothesis H0: The die is unbiased.

Under this H0, the expected frequency for each face is 276/6 = 46.
To find the value of χ²:


O_i           | 40 | 32  | 29  | 59  | 57  | 59
E_i           | 46 | 46  | 46  | 46  | 46  | 46
(O_i - E_i)²  | 36 | 196 | 289 | 169 | 121 | 169


Now

$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{980}{46} = 21.30$$

The tabulated value of χ² at 5% level of significance for (n - 1) = 6 - 1 = 5 d.f. (degrees of freedom)
is 11.07. Since the calculated value of χ² = 21.30 > 11.07, the tabulated value, H0 is rejected,
i.e., the die is biased.
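
The statistic for the die can be recomputed with a few lines of Python. This is a sketch only; the function name `chi_square` is an assumption, and the expected frequencies are taken as the total number of throws divided equally among the six faces.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [40, 32, 29, 59, 57, 59]
# Under H0 (unbiased die) every face has the same expected frequency.
expected = [sum(observed) / len(observed)] * len(observed)
chi2 = chi_square(observed, expected)
print(expected[0], round(chi2, 2))
```

The value is compared with the tabulated χ² for 6 - 1 = 5 degrees of freedom.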
Example (GTU NM 2020, 5 marks)


In an experiment on immunization of cattle from tuberculosis the following results were
obtained:


               | Affected | Unaffected
Inoculated     | 12       | 26
Not inoculated | 16       | 6


Examine the effect of the vaccine in controlling the susceptibility to tuberculosis. (Given that for
5% level of significance χ² for 1 d.f. is 3.841, for 2 d.f. is 5.99 and for 3 d.f. is 7.815).


Solution


We set up the hypothesis that the vaccine has no effect in controlling susceptibility to tuberculosis.
On this hypothesis the expected frequencies are as shown in the following table:


               | Affected          | Unaffected   | Total
Inoculated     | 38 x 28/60 ≈ 18   | 38 - 18 = 20 | 38
Not inoculated | 28 - 18 = 10      | 22 - 10 = 12 | 22
Total          | 28                | 32           | 60

$$\chi^2 = \frac{(12 - 18)^2}{18} + \frac{(26 - 20)^2}{20} + \frac{(16 - 10)^2}{10} + \frac{(6 - 12)^2}{12} = 10.4$$

Now the tabulated value of χ² at 5% level of significance for one degree of freedom is
3.841. Since the calculated value is greater than this value, the hypothesis is rejected and
consequently the vaccine is effective in controlling susceptibility to tuberculosis.

Example (UTU 2013, 2016, 10 marks): 


The following frequency distribution gives the frequencies of seeds in a pea breeding
experiment:

Round & yellow | Wrinkled & yellow | Round & green | Wrinkled & green | Total
315            | 101               | 108           | 32               | 556


Theory predicts that the frequencies should be in the ratio 9:3:3:1. Examine the correspondence between
theory and experiment.


Solution


Taking the hypothesis that the theory fits well with the experiment, the expected frequencies are
respectively

$$\frac{9}{16} \times 556, \quad \frac{3}{16} \times 556, \quad \frac{3}{16} \times 556, \quad \frac{1}{16} \times 556$$

i.e. 313, 104, 104, 35.
Thus

O | 315 | 101 | 108 | 32
E | 313 | 104 | 104 | 35

$$\chi^2 = \frac{(315 - 313)^2}{313} + \frac{(101 - 104)^2}{104} + \frac{(108 - 104)^2}{104} + \frac{(32 - 35)^2}{35} \approx 0.51$$

Degrees of freedom = 4 - 1 = 3
Table value of χ² for 3 d.f. at 5% level of significance = 7.815.


Since the calculated value of χ² is much less than the table value, the hypothesis may be
accepted. Hence there is good correspondence between theory and experiment.


Example (GTU NM 2020, 7 marks) 
Five coins are tossed 3200 times and the following results are obtained:


No. of heads | 0  | 1   | 2    | 3   | 4   | 5
Frequency    | 80 | 570 | 1100 | 900 | 500 | 50


If χ² for 5 d.f. at 5% level of significance is 11.07, test the hypothesis that the coins are
unbiased.


Solution
Let H0: The coins are unbiased, that is, p = q = 1/2.

Apply the binomial probability distribution to get the expected frequency of r heads as follows:

$$E_r = N \binom{5}{r} p^r q^{5-r} = 3200 \binom{5}{r} \left(\frac{1}{2}\right)^r \left(\frac{1}{2}\right)^{5-r} = 3200 \binom{5}{r} \left(\frac{1}{2}\right)^5$$

The table is given below:


O     | E    | (O - E)² | (O - E)²/E
80    | 100  | 400      | 4.00
570   | 500  | 4900     | 9.80
1100  | 1000 | 10,000   | 10.00
900   | 1000 | 10,000   | 10.00
500   | 500  | 0        | 0.00
50    | 100  | 2500     | 25.00
Total |      |          | 58.80


Since χ² = 58.80 is more than its critical value χ² = 11.07 for df = 6 - 1 = 5 and α = 0.05, the
null hypothesis is rejected.


Problem 


The theory predicts that the proportions of beans in the four groups A, B, C and D should be 9:3:3:1.
In an experiment among 1600 beans, the numbers in the four groups were 882, 313, 287 and
118. Does the experimental result support the theory? (χ² for 3 d.f. at 5% level = 7.815).


Answer: χ² = 4.7266 < table value, hypothesis may be accepted.


Problem 


The following table shows the distribution of digits in numbers chosen at random from a 
telephone directory: 


Digits    | 0    | 1    | 2   | 3   | 4    | 5   | 6    | 7   | 8   | 9
Frequency | 1026 | 1107 | 997 | 966 | 1075 | 933 | 1107 | 972 | 964 | 853


Test whether the digits may be taken to occur equally frequently in the directory. The tabulated
value of χ² at 5% level of significance for 9 d.f. is 16.919.


Answer: Ho is rejected. The digits taken in the directory do not occur equally frequently. 


Example (UTU 2015, 5 marks) 


The weight of a drug produced by Ganga Pharmaceutical Co. follows normal distribution. The 
specified variance of the weight of the drug of this population is 0.25 mg. The quality engineer 
of the firm claims that the variance of the weight of the drug does not differ significantly from 
the specified variance of the weight of the drug of the population. So, the purchase officer of 
the Alpha Hospital who places order for that drug with the Ganga Pharmaceutical Co. has 
selected a random sample of 12 drugs. The variance of the weight sample is found to be 0.49 
mg. Verify the intuitions of the quality manager of Ganga Pharmaceutical Co. at a significance 
level of 0.10 using Chi Square. (Tabulated chi square value is 19.675 with 11 dof). 


Solution 


The weight of the drug follows a normal distribution. We also have the population variance of the weight
of drugs σ² = 0.25 mg; sample size n = 12; sample variance of the weight of drugs S² = 0.49
mg; and significance level α = 0.10.


The null and alternate hypotheses are:
H0: σ² = 0.25
H1: σ² ≠ 0.25
It is a two tailed test.
The chi-square statistic to test the variance is:

$$\chi^2 = \frac{(n - 1) S^2}{\sigma^2} = \frac{(12 - 1) \times 0.49}{0.25} = 21.56$$

df = 12 - 1 = 11; tabulated χ² at α = 0.10 is 19.675

We see that the calculated χ² > the tabulated χ².

Hence, the null hypothesis H0 will be rejected.
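
The variance test statistic is a one-line computation. A hedged Python sketch, with the hypothetical helper name `chi_square_variance`:

```python
def chi_square_variance(n, sample_var, pop_var):
    """Return (chi-square, df) for H0: population variance = pop_var."""
    return (n - 1) * sample_var / pop_var, n - 1


# Drug example: n = 12, sample variance 0.49, specified variance 0.25
chi2, df = chi_square_variance(12, 0.49, 0.25)
print(round(chi2, 2), df)
```

The critical value (19.675 for 11 d.f. here) still has to be read from a χ² table; this sketch does not compute it.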


Problem 


The weight of cement bags produced by Vignesh Cement Company follows normal distribution. 
The quality assistant at the final inspection section of the company feels that the variance of the 
weight of the cement bags has increased from a specified maximum variance of 0.64 kg which 
will lead to customer complaints. Hence, he has selected a sample of 8 cement bags and found 
that the variance of the sample is 0.36 kg. Check the intuition of the quality assistant at a 
significance level of 0.01. Given χ² at the 0.01 level = 1.239.


Answer: χ² = 3.938, H0 will be accepted. Left tailed test.


F TEST 


In testing the significance of the difference of two means of two samples, we assumed that the
two samples came from the same population or from populations with equal variance. The object of
the F-test is to discover whether two independent estimates of population variance differ
significantly, or whether the two samples may be regarded as drawn from normal
populations having the same variance. Hence, before applying the t-test for the significance of
the difference of two means, we have to test for the equality of population variances by using the
F-test.


To test whether these estimates, s1² and s2², are significantly different, or whether the samples may be
regarded as drawn from the same population or from two populations with the same variance σ²,
we set up the null hypothesis

H0: σ1² = σ2² = σ²,
i.e., the independent estimates of the common population variance do not differ significantly.
To carry out the test of significance of the difference of the variances we calculate the test
statistic

$$F = \frac{s_1^2}{s_2^2}$$

where the numerator is greater than the denominator, i.e., s1² > s2².
If the calculated value of F exceeds F0.05 for (n1 - 1), (n2 - 1) degrees of freedom given in the
table, we conclude that the ratio is significant at 5% level.


Example (BPUT 2020, 16 marks) 

In a sample of 8 observations, the sum of squared deviations of items from the mean was 94.5.
In another sample of 10 observations, the value was found to be 101.7. Test whether the difference
is significant at 5% level of significance.

Solution

H0: σ1² = σ2²; H1: σ1² ≠ σ2²

Given that n1 = 8; $\sum (x_1 - \bar{x}_1)^2 = 94.5$

n2 = 10; $\sum (x_2 - \bar{x}_2)^2 = 101.7$

Now

$$s_1^2 = \frac{\sum (x_1 - \bar{x}_1)^2}{n_1 - 1} = \frac{94.5}{7} = 13.5$$

and

$$s_2^2 = \frac{\sum (x_2 - \bar{x}_2)^2}{n_2 - 1} = \frac{101.7}{9} = 11.3$$

$$F = \frac{s_1^2}{s_2^2} = \frac{13.5}{11.3} = 1.195$$

df = 7 and 9, F0.05 = 3.29


The calculated value of F is less than the table value. Hence, we accept the hypothesis and
conclude that the difference in the variances of the two samples is not significant at 5% level.
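
A short Python sketch of the F ratio, starting from the sums of squared deviations. The function name `f_ratio` is hypothetical; it places the larger variance estimate in the numerator and returns the degrees of freedom in the matching order.

```python
def f_ratio(ss1, n1, ss2, n2):
    """Return (F, (df_numerator, df_denominator)).

    ss1, ss2 are sums of squared deviations from the respective sample means.
    """
    v1, v2 = ss1 / (n1 - 1), ss2 / (n2 - 1)   # variance estimates
    if v1 >= v2:
        return v1 / v2, (n1 - 1, n2 - 1)
    return v2 / v1, (n2 - 1, n1 - 1)


F, dfs = f_ratio(94.5, 8, 101.7, 10)
print(round(F, 3), dfs)
```

Returning the degrees of freedom alongside F makes it harder to look up the table value with the two entries swapped.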


Problem 
Two independent samples of sizes 7 and 6 had the following values:

Sample A | 28 | 30 | 32 | 33 | 31 | 29 | 34
Sample B | 29 | 30 | 30 | 24 | 27 | 28


Examine whether the samples have been drawn from normal populations having the same variance.


Solution 


$H_0$: The variances are equal, i.e., $\sigma_1^2 = \sigma_2^2$; the samples have been drawn from
normal populations with the same variance.

$H_1$: $\sigma_1^2 \neq \sigma_2^2$

$X_1$ | $X_1 - \bar{X}_1$ | $(X_1 - \bar{X}_1)^2$ | $X_2$ | $X_2 - \bar{X}_2$ | $(X_2 - \bar{X}_2)^2$
28 | -3 | 9 | 29 | 1 | 1
30 | -1 | 1 | 30 | 2 | 4
32 | 1 | 1 | 30 | 2 | 4
33 | 2 | 4 | 24 | -4 | 16
31 | 0 | 0 | 27 | -1 | 1
29 | -2 | 4 | 28 | 0 | 0
34 | 3 | 9 | | |
Total | | 28 | | | 26


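$\bar{X}_1 = 31$; $n_1 = 7$; $\sum (X_1 - \bar{X}_1)^2 = 28$
$\bar{X}_2 = 28$; $n_2 = 6$; $\sum (X_2 - \bar{X}_2)^2 = 26$

$s_1^2 = \dfrac{\sum (X_1 - \bar{X}_1)^2}{n_1 - 1} = \dfrac{28}{6} = 4.666$

$s_2^2 = \dfrac{\sum (X_2 - \bar{X}_2)^2}{n_2 - 1} = \dfrac{26}{5} = 5.2$

$F = \dfrac{s_2^2}{s_1^2} = \dfrac{5.2}{4.666} \approx 1.114$ (since $s_2^2 > s_1^2$)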


The tabulated value of F for (6 - 1, 7 - 1) = (5, 6) degrees of freedom at the 5% level of significance is
4.39. The calculated value of F is less than the tabulated value, so $H_0$ is accepted. Hence, there
is no significant difference between the variances: the samples may be regarded as drawn from
normal populations with the same variance.
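
The arithmetic of this problem can be checked with a short Python sketch that works from the raw observations. It is only an illustration: the variable names are made up for this sketch, and the tabulated value 4.39 is copied from the text and not computed.

```python
from statistics import mean, variance

sample_a = [28, 30, 32, 33, 31, 29, 34]
sample_b = [29, 30, 30, 24, 27, 28]

# statistics.variance uses the divisor n - 1, matching s^2 in the text.
var_a = variance(sample_a)  # 28 / 6
var_b = variance(sample_b)  # 26 / 5

# The larger estimate goes in the numerator; its df is listed first.
if var_a >= var_b:
    f_value = var_a / var_b
    df = (len(sample_a) - 1, len(sample_b) - 1)
else:
    f_value = var_b / var_a
    df = (len(sample_b) - 1, len(sample_a) - 1)

F_TABLE = 4.39  # tabulated F for (5, 6) df at the 5% level, as quoted above

print(mean(sample_a), mean(sample_b))
print(round(f_value, 3), df, f_value > F_TABLE)
```

Because Sample B has the larger variance estimate here, the degrees of freedom come out as (5, 6) and not (6, 5), which is why the table is read at (6 - 1, 7 - 1).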


Problem 

Two independent samples of sizes 9 and 8 gave the sum of squares of deviations from their 
respective means as 160 and 91 respectively. Can the samples be regarded as drawn from the 
normal populations with equal variance? Given: F0.05 (8, 7) = 3.73 and F0.05 (7, 8) = 3.50 
Answer: $s_1^2 = 160/8 = 20$ and $s_2^2 = 91/7 = 13$, so $F_{cal} = 20/13 \approx 1.54 < F_{0.05}(8, 7) = 3.73$ (given).

The null hypothesis $H_0$ is accepted. The samples may be regarded as drawn from normal
populations with equal variance.

ASSIGNMENT


Q.1. (JNTUH 2018, 2 marks): Write the conditions of validity of the $\chi^2$ test.
Q.2. (JNTUH 2018, 2 marks): Construct the sampling distribution of means for the population 3, 7, 11, 15 by
drawing samples of size two without replacement. Determine (i) $\mu$ (ii) $\sigma$ (iii) the sampling distribution of means.
Q.3. (JNTUH 2018, 2 marks): Discuss types of errors of the test of hypothesis. 


Q.4. (JNTUH 2019, 2 marks): A random sample of size 100 has a standard deviation of 5. What can you say 
about maximum error with 95% confidence? 


Q.5. (JNTUH 2019, 2 marks): Define central limit theorem 

Q.6. (JNTUH 2019, 2 marks): Define Type I and Type II errors. 

Q.7. (JNTUH 2019, 2 marks): Explain one way classification of ANOVA. 

Q.8. (JNTUH 2018, 5 marks): Discuss critical region and level of significance with an example.


Q.9. (JNTUH 2018, 5 marks): Explain why the larger variance is placed in the numerator of the statistic F. 
Discuss the application of the F-test in testing whether two variances are homogeneous.


Q.10. (JNTUH 2015, 2018, 5 marks): A sample of 11 rats from a central population had an average blood 
viscosity of 3.92 with a standard deviation of 0.61. Estimate the 95% confidence limits for the mean blood 
viscosity of the population. 


Answer: df = 11 - 1 = 10; $t_{0.05}$ at 10 df = 2.23
$s$ = SD = 0.61; $n = 11$; $\bar{x} = 3.92$


Confidence limits are $\bar{x} \pm t_{0.05}\left(\dfrac{s}{\sqrt{n}}\right) = 3.92 \pm 2.23\left(\dfrac{0.61}{\sqrt{11}}\right) = (3.51, 4.33)$
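
The limits in this answer can be verified with a small Python sketch. The helper name `t_confidence_limits` is an assumption for illustration only, and the t value is taken from the table as quoted in the answer; it is not computed.

```python
from math import sqrt


def t_confidence_limits(sample_mean, sample_sd, n, t_table):
    """Return (lower, upper) limits: mean +/- t * s / sqrt(n).

    t_table is the tabulated t value for n - 1 degrees of freedom,
    supplied by the caller (assumption: read from a t table).
    """
    margin = t_table * sample_sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Figures from Q.10: mean 3.92, SD 0.61, n = 11, t for 10 df = 2.23.
lower, upper = t_confidence_limits(3.92, 0.61, 11, t_table=2.23)
print(round(lower, 2), round(upper, 2))
```

With the figures of Q.10 the sketch reproduces the limits (3.51, 4.33).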


Q.11. (AKTU 2018, 2020, 7 marks): Find the measure of Skewness and kurtosis based on moments for the 
following distribution and draw your conclusion 


Marks | 5-15 | 15-25 | 25-35 | 35-45 | 45-55
No. of students | 1 | 3 | 5 | 7 | 4

Q.12. (AKTU 2019, 3.5 marks): A die is thrown 276 times and the results are given below:

No. appeared on the die | 1 | 2 | 3 | 4 | 5 | 6
Frequency | 40 | 32 | 29 | 59 | 57 | 59


Test whether the die is biased or not. [Tabulated value of $\chi^2$ at the 5% level of significance for 5 degrees of freedom is
11.09].

Answer: Solved in this module. 

Q.13. (GTU NM 2020, 7 marks): Five coins are tossed 3200 times and the following results are obtained: 

No. of heads | 0 | 1 | 2 | 3 | 4 | 5
Frequency | 80 | 570 | 1100 | 900 | 500 | 50


If $\chi^2$ for 5 d.f. at the 5% level of significance is 11.07, test the hypothesis that the coins are unbiased.


Answer: Solved in this module.


Q.14. (GTU NM 2020, 5 marks): In an experiment on immunization of cattle from tuberculosis the following
results were obtained:


 | Affected | Unaffected
Inoculated | 12 | 26
Not inoculated | 16 | 6


Examine the effect of vaccine in controlling the susceptibility to tuberculosis. (Given that for 5% level of
significance $\chi^2$ for 1 d.f. is 3.841, for 2 d.f. is 5.99 and for 3 d.f. is 7.815.)
Answer: Solved in this module. 


Q.15. (GTU NM 2020, 5 marks): The heights of 9 males of a given locality are found to be 45, 47, 50, 52, 48, 47,
49, 53, 51 inches. Is it reasonable to believe that the average height differs significantly from the assumed mean of 47.5
inches? (Given that for 5% level of significance t for 7 d.f. is 2.365, for 8 d.f. is 2.306 and for 9 d.f. is 2.262.)


Answer: Solved in this module. 


Q.16. (BPUT 2020, 6 marks): In a sample of 1000 people in Odisha, 540 are rice eaters and the rest are wheat eaters.
Can we assume that both rice and wheat are equally popular in this state at the 1% level of significance?


Answer: A similar problem is solved in this module. 


Q.17. (BPUT 2020, 16 marks): In a sample of 8 observations, the sum of squared deviations of items from the
mean was 94.5. In another sample of 10 observations, the value was found to be 101.7. Test whether the difference is
significant at the 5% level of significance.


Answer: Solved in this module. 


Q.18. (JNTUH 2018, 10 marks): A sample of 400 items is taken from a normal population whose mean is 4 and 
variance 4. If the sample mean is 4.45, can the samples be regarded as a simple sample? 


Answer: Solved in this module. 


Q.19. (JNTUH 2018, 5 marks): In a sample of 600 students of a certain college 400 are found to use ball pens. In 
another college from a sample of 900 students 450 were found to use ball pens. Test whether two colleges are 
significantly different with respect to the habit of using ball pens? 


Answer: z = 6.38; reject $H_0$.
Hint: Test of significance for the difference of proportions. See module for formula.
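
The stated value of z can be checked with the following Python sketch. It assumes the pooled-proportion form of the standard error, which may differ in notation from the formula given in the module; the function name `two_proportion_z` is hypothetical. Since the question states no level of significance, the comparison with 1.96 (two-tailed, 5%) is also an assumption.

```python
from math import sqrt


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for the difference of two sample proportions.

    Assumes the pooled estimate P = (x1 + x2) / (n1 + n2) is used
    in the standard error sqrt(P * Q * (1/n1 + 1/n2)), with Q = 1 - P.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_error


# Figures from Q.19: 400 of 600 students and 450 of 900 students.
z = two_proportion_z(400, 600, 450, 900)
print(round(z, 2), abs(z) > 1.96)  # 1.96: assumed two-tailed 5% critical value
```

For the figures in Q.19 this gives z of about 6.38, in agreement with the stated answer.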


Q.20. (JNTUH 2019, 5 marks): Assuming that $\sigma = 20.0$, how large a random sample should be taken to assert with
probability 0.95 that the sample mean will not differ from the true mean by more than 3.0 points?


Answer: Solved in this module. 


Q.21. (JNTUH 2019, 5 marks): A normal population has a mean of 0.1 and standard deviation of 2.1. Find the 
probability that mean of a sample of size 900 will be negative. 


Answer: Solved in this module. 


Q.22. (JNTUH 2009, 2010, 2019, 5 marks): Find 95% confidence limits for the mean of a normally distributed
population from which the following sample was taken: 15, 17, 10, 18, 16, 9, 7, 11, 13, 14.


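Answer:

$\bar{x} = 13$; $s^2 = \dfrac{\sum (x - \bar{x})^2}{n - 1} = \dfrac{1}{9}\left((15 - 13)^2 + (17 - 13)^2 + \dots + (14 - 13)^2\right) = \dfrac{120}{9} = \dfrac{40}{3}$


$z_{0.05} = 1.96$


Confidence limits are $\bar{x} \pm z_{0.05}\left(\dfrac{s}{\sqrt{n}}\right) = 13 \pm 1.96\left(\dfrac{\sqrt{40/3}}{\sqrt{10}}\right) = (13 \pm 2.26) = (10.74, 15.26)$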
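
The same limits can be recomputed from the raw observations with a brief Python sketch. The variable names are illustrative only, and the multiplier 1.96 is simply the one used in the answer above; it is kept as a named constant so that a different tabulated value can be substituted.

```python
from math import sqrt
from statistics import mean, stdev

sample = [15, 17, 10, 18, 16, 9, 7, 11, 13, 14]

n = len(sample)
x_bar = mean(sample)  # sample mean
s = stdev(sample)     # square root of sum((x - x_bar)^2) / (n - 1)

MULTIPLIER = 1.96     # value used in the answer above (assumed, not computed)

margin = MULTIPLIER * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
print(x_bar, round(s * s, 4), round(lower, 2), round(upper, 2))
```

Since n = 10 is a small sample, replacing 1.96 with the t value for 9 d.f. (2.262, as quoted in Q.15) would give somewhat wider limits; the sketch keeps the multiplier used in the answer.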


Q.23. (JNTUH 2019, 5 marks): In a random sample of 60 workers, the average time taken by them to get to work
is 33.8 minutes with a standard deviation of 6.1 minutes. Can we reject the null hypothesis $\mu = 32.6$ minutes in
favour of the alternative hypothesis $\mu > 32.6$ at the $\alpha = 0.025$ level of significance?


Answer: Solved in this module. 


Q.24. (JNTUH 2019, 5 marks): The mean life of a sample of 10 electric bulbs was found to be 1456 hours with 
S.D. of 423 hours. A second sample of 17 bulbs chosen from a different batch showed a mean life of 1280 hours 
with S.D. of 398 hours. Is there a significant difference between the means of two batches? 


Answer: Solved in this module. 


Q.25. (UTU 2013, 2016, 10 marks): The following frequency distribution gives the frequencies of seeds in a pea 
breeding experiment: 


Round & yellow | Wrinkled & yellow | Round & green | Wrinkled & green | Total
315 | 101 | 108 | 32 | 556


Theory predicts that the frequencies should be in the ratio 9:3:3:1. Examine the correspondence between theory and
experiment.


Answer: Solved in this module. 


Q.26. (UTU 2013, 5 marks): Before an increase in excise duty on tea, 800 out of 1000 persons were found
to be tea drinkers. After the increase in duty, 800 persons were known to be tea drinkers in a sample of 1200. Do
you think that there has been a significant decrease in the consumption of tea after the increase in excise duty?


Answer: Solved in this module. 


Q.27. (UTU 2015, 5 marks): The weight of a drug produced by Ganga Pharmaceutical Co. follows normal 
distribution. The specified variance of the weight of the drug of this population is 0.25 mg. The quality engineer of 
the firm claims that the variance of the weight of the drug does not differ significantly from the specified variance 
of the weight of the drug of the population. So, the purchase officer of the Alpha Hospital who places order for 
that drug with the Ganga Pharmaceutical Co. has selected a random sample of 12 drugs. The variance of the
weight in the sample is found to be 0.49 mg. Verify the claim of the quality engineer of Ganga Pharmaceutical Co. at
a significance level of 0.10 using the chi-square test. (Tabulated chi-square value is 19.675 with 11 d.o.f.)


Answer: Solved in this module. 


